-
-
Notifications
You must be signed in to change notification settings - Fork 329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make gix_path::env:shell()
more robust and use it in gix-command
#1862
base: main
Are you sure you want to change the base?
Conversation
`gix_command::Prepare` previously used `sh` on Windows rather than first checking for a usable `sh` implementation associated with a Git for Windows installation. This changes it to use the `gix-path` facility for finding what is likely the best 'sh' implementation for POSIX shell scripts that will operate on Git repositories. This increases consistency across how different crates find 'sh', and brings the benefit of preferring the Git for Windows `sh.exe` on Windows when it can be found reliably.
Because `gix-command` uses `gix_path::env::shell()` to find sh, and `gix-credentials` uses `gix-command`.
This makes the path returned by `gix_path::env::shell()` on Windows more usable by: 1. Adding components with `/` separators. While in principle a `\` should work, the path of the shell itself is used in shell scripts (script files and `sh -c` operands) that may not account for the presence of backslashes, and it is also harder to read paths with `\` in contexts where it appears escaped, which may include various messages from Rust code and shell scripts. The path before what we add will already use `/` and never `\`, unless `GIT_EXEC_PATH` has been set to a strange value, because it is based on `git --exec-path`, which by default gives a path with `/` separators. Thus, ensuring that the part we add uses `/` should be sufficient to get a path without `\` in all cases when it is clearly reasonable to do so. This therefore also usually increases stylistic consistency of the path, which is another factor that makes it more user-friendly in messages. This is needed to get tests to pass since changing `gix-command` to use `gix_path::env::shell()` on Windows, where a path is formatted in away that sometimes quotes `\` characters. Their expectations could be adjusted, but it seems likely that various other software, much of which may otherwise be working, has similar expectations. Using `/` instead of `\` works whether `\` is expected to be displayed quoted or not. 2. Check that the path to the shell plausibly has a shell there, only using it if it a file or a non-broken file symlink. When this is not the case, the fallback short name is used instead. 3. The fallback short name is changed from `sh` to `sh.exe`, since the `.exe` suffix is appended in other short names on Windows, such as `git.exe`, as well as being part of the filename component of the path we build for the shell when using the implementation provided as part of Git for Windows. Those changes only affect Windows. This also adds tests for (1) and (2) above, as well as for the expectation that we get an absolute path, to make sure we don't build a path that would be absolute on a Unix-like system but is relative on Windows (a path that starts with just one `/` or `\`). These tests are not Windows-specific, since all these expectations should already hold on Unix-like systems, where currently we are using the hard-coded path `/bin/sh`, which is an absolute path on those systems. (Some Unix-like systems may technically not have `/bin/sh` or it may not be the best path to use for a shell that should be POSIX-compatible, but we are already relying on this, and handling that better is outside the scope of the changes here.)
Thanks you very much for tackling this! I didn't look at the code of this PR but want to share some thoughts on the failure. Here is the program that is run: for arg in %O %A %B %L %P %S %X %Y %F; do echo $arg >> \"%A\"; done; cat \"%O\" \"%B\" >> \"%A\"" The failing output has 10 lines, whereas a valid output has 11 lines. Probably that is related. My feeling is that one of the arguments in the for loop substitutions is empty. It would disappear then, and everything is offset. From there it should become more obvious which parameter comes out empty, and from there it should become clear what the problem really is. I hope that helps. |
Yes, in the failing run, the |
A couple things didn't mention in #1862 (comment):
It doesn't look like that specific mechanism accounts for the failure, because in the failing case a As an early step (before even doing most of the testing described above), I tried adding The broader issue for me examining this specific test is that I am familiar neither with the details of
No problem. My goal that this is related to is to be able to fix some other bugs where we do not run scripts correctly, such as bugs in the special casing of scripts with shebangs when But that also means I risk become separated from the original bug-fixing goals, which this is peripheral to, if I shift my focus too much onto this. If possible, I'd like to avoid become too embroiled in the details of the
Edit: Reworded for clarity and reordered the sections. |
It attempts to make the arguments that the merge-driver was called with observable. I think there is no quoting because each of the arguments is known not to have spaces in it. However, the lack of quoting also makes empty arguments unobservable, a definitive shortcoming of the current implementation.
Here is the docs, but the short version is that it's a script into which various parameters can be substituted into - this substitution is performed by the the caller of the merge-driver. That caller does a bare string substitution, without any care for quotes or whitespace in general.
I definitely wouldn't want you to suffer unnecessarily. The key for me to solve this certainly is to reproduce the error. For now, it doesn't make much sense to me either, unless… it's a substitution engine, what if one of the arguments that it substitutes also contains substitutable characters? Probably that's not what's happening here though, as This is a patch that should tell exactly what's executed, and from there it would become more obvious what's really happening. diff --git a/gix-merge/src/blob/platform/merge.rs b/gix-merge/src/blob/platform/merge.rs
index 058a30128..324742a00 100644
--- a/gix-merge/src/blob/platform/merge.rs
+++ b/gix-merge/src/blob/platform/merge.rs
@@ -211,7 +211,7 @@ pub(super) mod inner {
cmd.extend_from_slice(token);
}
- Ok(merge::Command {
+ Ok(dbg!(merge::Command {
cmd: gix_command::prepare(gix_path::from_bstring(cmd))
.with_context(context)
.command_may_be_shell_script()
@@ -223,7 +223,7 @@ pub(super) mod inner {
current_path: ours_path,
ancestor: base_tmp,
other: theirs_tmp,
- })
+ }))
}
/// Return the configured driver program for use with [`Self::prepare_external_driver()`], or `Err` I hope that helps. |
Thanks--you're right that the actual arguments that are passed are important information. I had glossed over this because it did not look like they differed between the passing and failing environment, but that doesn't mean they aren't needed to know what's going on. The results, with the patch you gave that reveals the
That's a bit hard to read due to the way the diff --git a/gix-merge/src/blob/platform/merge.rs b/gix-merge/src/blob/platform/merge.rs
index 058a30128..39e27e51f 100644
--- a/gix-merge/src/blob/platform/merge.rs
+++ b/gix-merge/src/blob/platform/merge.rs
@@ -399,6 +399,7 @@ impl<'parent> PlatformRef<'parent> {
match self.configured_driver() {
Ok(driver) => {
let mut cmd = self.prepare_external_driver(driver.command.clone(), labels, context.clone())?;
+ panic!("program: {:?}\nargs: {:#?}", cmd.get_program(), cmd.get_args());
let status = cmd.status().map_err(|err| Error::SpawnExternalDriver {
cmd: format!("{:?}", cmd.cmd),
source: err, Then, when run from PowerShell, the output included:
When run from Git Bash, where the test passes when allowed to proceed, the output included this instead:
This is to say that only the randomly generated suffixes after On the main branch, in PowerShell, where the test also passes if allowed to proceed, the only differences besides those temporary files' names is the shell command:
State from previous runs in this experiment is not likely to be a contributing factor; I ran On main, diff --git a/gix-path/src/env/mod.rs b/gix-path/src/env/mod.rs
index 78e2da294..cd6203df7 100644
--- a/gix-path/src/env/mod.rs
+++ b/gix-path/src/env/mod.rs
@@ -51,7 +51,7 @@ pub fn shell() -> &'static OsStr {
// more readable messages, append literally with `/` separators. The path from
// `git --exec-path` will already have all `/` separators (and no trailing `/`)
// unless it was explicitly overridden to an unusual value via `GIT_EXEC_PATH`.
- raw_path.push("/usr/bin/sh.exe");
+ raw_path.push("/bin/sh.exe");
raw_path
})
.filter(|raw_path| { But I am reluctant to apply that change without understanding what it is about the effect of the Git for Windows shim that is helping, to know that this shim really is preferable rather than the test happening to hinge on a quirk that is different. One reason I am reluctant is that, under that change, if My guess is that the relevant effect of the shim is in changing the environment, but I am not certain it refrains entirely from adjusting arguments. (I think the shim does not adjust arguments directly. But even if that is right, I'll try to modify the command to report its arguments and environment in a way that they cannot be confused with anything else, such as by writing them to a file. However, since this will necessarily modifying the command or its environment, it is not certain to be successful at revealing the cause. Edit: A possible contributing factor is that I also have some When the shim is used, two directories belonging to the MSYS2 installation that is associated with the running shell, The reason I say C:\Users\ek\scoop\apps\git\2.48.1\usr\bin\sh.exe -c 'echo "$PATH" | tr ":" "\n" | sort' > ubp.txt
C:\Users\ek\scoop\apps\git\2.48.1\bin\sh.exe -c 'echo "$PATH" | tr ":" "\n" | sort' > bp.txt
git --no-pager diff --no-index ubp.txt bp.txt Where the output of that last command is: diff --git a/ubp.txt b/bp.txt
index 7390dca90..7190bd5b3 100644
--- a/ubp.txt
+++ b/bp.txt
@@ -1,3 +1,6 @@
+/mingw64/bin
+/usr/bin
+/c/Users/ek/bin
/c/Users/ek/scoop/apps/pwsh/7.5.0
/c/Program Files (x86)/VMware/VMware Workstation/bin
/c/Windows/system32 Note that this experiment is not equivalent to checking the |
This is very strange and puzzling. And that's particularly strange as there isn't even any argument to interpret, the values are quite literally baked in. All I can think of is poking around in the script, maybe by starting to execute it directly. It's pretty clear what is invoked in the working and non-working case, so that could be generalized to invocations that are repeatable on the command-line directly. From there it can be reduced and probed to see what exactly is causing the observable difference (assuming it reproduces in the first place). If it doesn't, we'd know that some environment variable is affecting it as well. And I say this without even implying that you spend more time on this, I am just sharing thoughts. |
This makes
git_path::env::shell()
more likely to work correctly on Windows, then attempts to use it ingix_command::Prepare
when running a command with a shell due touse_shell
being set totrue
. Full details are in the commit messages, though the changes are also summarized here.I think this is not ready to merge yet, due to failures that happen when running the tests locally from PowerShell, but that I cannot produce otherwise. I do not know what causes these failures, and I expect I may have difficulty getting this to a point where it would be ready to merge without input. Since the failures happen in a test that was fairly recently added, I figured you might just know what is going on.
I think
shell()
can and should be used forgix-command
. But if not then the reason will be relevant to other potential uses ofshell()
. In that case, assuming they leave enough good uses forshell()
that it is kept, I hope to discover and articulate those reasons in the documentation forshell()
. But I think most likely there is a bug or wrong test expectation somewhere that accounts for the local test failure, and that either the current implementation ofshell()
with the modifications included here, or something much like it, is suitable for broad use.Making
gix_path::env::shell()
more robustThis makes
gix_path::env::shell()
more likely to work correctly on Windows by:Checking that the path to a
sh.exe
associated with Git for Windows actually exists (and is either a regular file or a symbolic link that, when fully dereferenced, gives a file), because if that is not the case, then the fallback of just looking upsh.exe
using aPATH
search is more likely to succeed.This change is needed to use
shell()
more widely without breaking things on systems wheresh.exe
is available and usable but cannot be found in the usual location where Git for Windows supplies it.Using
/
rather than\
directory separators when appendingusr
,bin
, andsh.exe
components to the Git for Windows installation directory path obtained viagit --exec-path
and going up three components. This causes the path throughsh.exe
is run to use all/
separators except in the unusual situation thatGIT_EXEC_PATH
has been set to a path with\
separator. (This is unusual because people don't usually setGIT_EXEC_PATH
. It would probably be best, when setting it on Windows, to use/
separators, but I do not actually know how often people do that.)I believe this is only a minor improvement, in view of this path not being automatically passed through to the shell, as described in #1840 (reply in thread). But this change also makes tests in
gix-command
andgix-credentials
pass that would otherwise require more extensive modification related to\
escaping, when modifyinggix-command
to useshell()
.Using
gix_path::env::shell()
ingix-command
Before the changes in this PR,
gix-command
uses the relative pathsh.exe
on Windows. This works a significant fraction of the time already. In particular, the problems with the relative pathbash.exe
that make it frequently unusable to find a non-WSLbash
on Windows, described in #1359 (comment), do not apply tosh.exe
, since there is no Microsoft-providedsh.exe
related to WSL. However:sh.exe
is not always the best choice when present.sh.exe
that can be found inPATH
. For example, if the onlysh.exe
is supplied by Git for Windows, and neither of the Git for Windows directories that contain ansh.exe
binary are inPATH
, then before the changes in this PR, a shell will not be found unlessshell_program()
is called (every time a command withuse_shell
is run).Therefore, this uses
shell()
, with the improvements described above, ingix-command
.As alluded to above, I think this is not ready yet. All tests pass on CI, as well a when run locally from Git Bash. But when running them locally in PowerShell,
gix-merge::merge blob::platform::merge::with_external
fails. The failure does not requireGIX_TEST_IGNORE_ARCHIVES
to be set, and this PR does not currently modify any test fixture scripts.Windows failures that occur when tests are run from PowerShell but not from Git Bash have usually been due to #1359 and are thus not expected since #1712, where
gix-testtools
findsbash.exe
associated with Git for Windows and uses it to run fixtures. But that fix didn't affect the waygix-command
uses a shell, so it does not provide for the kind of shell invocations occurring here.Still, it doesn't make sense to me why the failure happens, and why it happens only when I run the tests locally from PowerShell. On this system, in the
PATH
, including when accessed from PowerShell, the commandsh
finds is a shim for the same shell that, as of this PR,gix-path
finds andgix-command
uses. Furthermore, the change here makesgix-path
findsh.exe
more like the waygix-testtools
findsbash.exe
, both of which are Bash when provided by Git for Windows.Furthermore, while most of my local testing has been with Git for Windows 2.48.1, I have verified that the test fails the same way on the same system with Git for Windows 2.47.1(2), and that the experiment described below to examine the details of the failure also gives the same results with that version. This is to say that it seems like this failure is not related, even conceptually, to #1849.
The failing assertion is the first of the assertions in this fragment of the failing test:
gitoxide/gix-merge/tests/merge/blob/platform.rs
Lines 291 to 314 in 2efce72
The first three of the cleaned driver lines are supposed to contain
.tmp
, but the first one is justb
, which is expected to apper later. This somehow varies based on how the tests are run, and there are at least three environments for figuring out what's going on: CI, local in Git Bash, and local in PowerShell. It seems like they shouldn't differ, but they do.Locally, it is possible to inspect things in a debugger, but this is harder on CI because while action-tmate can be used to get a reverse shell, on Windows runners the reverse shell's environment differs from the noninteractive environment in which the tests run. Fortunately, I can break the tests in a way that makes them always fail by panicking and printing all the lines:
To be clear, while all of the following are from messages printed in failing tests, the tests fail because of that change. All of them would otherwise pass, except the local run, in Windows, from PowerShell, shown last. Also, I do not know why this is happening. I am hoping you may have some insight into it.
Based off the main branch, on CI, in Ubuntu (would pass)
In the test-fast (ubuntu-latest) run, for comparison:
Based off the main branch, on CI, in Windows (would pass)
In the test-fast (windows-latest) run:
Note how backslashes are being treated as escape characters and removed, which seems like it is not intended, but this does not in and of itself cause a problem with this test: it happens even on the main branch, and even without the changes from this PR.
Based off the main branch, locally, in Windows, from Git Bash (would pass)
Via
cargo nextest run -p gix-merge --no-fail-fast
:Based off the main branch, locally, in Windows, from PowerShell (would pass)
Via
cargo nextest run -p gix-merge --no-fail-fast
:Based off this feature branch, on CI, in Ubuntu (would pass)
In the test-fast (ubuntu-latest) run, for comparison:
That is the same as based off the main branch, which is very much as expected, because the changes on this feature branch do not produce a different value for
gix_path::env::shell()
on systems other than Windows.Based off this feature branch, on CI, on Windows (would pass)
In the test-fast (windows-latest) run:
This is similar to the main branch CI and local Git Bash runs shown above.
Based off this feature branch, locally, in Windows, from Git Bash (would pass)
Via
cargo nextest run -p gix-merge --no-fail-fast
:Based off this feature branch, locally, in Windows, from PowerShell (would fail)
Via
cargo nextest run -p gix-merge --no-fail-fast
:Note how
b
comes first, and the three lines with paths to files named containing.tmp
follow it. This causes this assertion, which is run for the first three lines, to fail in the first iteration (i.e. for the first line):gitoxide/gix-merge/tests/merge/blob/platform.rs
Line 297 in 2efce72
But I do not understand why. I'm not very familiar with the details of
gix-merge
. I'm also not entirely clear on which kinds of shell transformations (from parsing%
items into fields, from shell expansions on unquoted parameter expansion, and from the implicit effect of joining on spaces when passing multiple arguments arising from an unquoted expansion toecho
) are intended to occur in the test command, and with what effects.Is this likely to be caused by different environments across shells run in different ways? Or is it somehow due to the name the shell is called with, even though that does not become its
$0
here and, furthermore, its$0
does not seem to be expanded? What could be going on?